Fast speech adaptation in linear spectral domain for additive and convolutional noise

Authors

  • Dongsuk Yook
  • Donghyun Kim
Abstract

In this paper, we propose a transform-based adaptation technique for robust speech recognition in unknown environments. It uses the maximum likelihood spectral transform (MLST) algorithm with additive and convolutional noise parameters. Many adaptation algorithms have previously been proposed in the cepstral domain. Although the cepstral domain may be appropriate for speech recognition, it is difficult to handle environmental noise directly there. Our approach therefore deals with such noise in the linear spectral domain, in which speech is directly affected by the noise. As a result, a small number of noise parameters suffices for fast adaptation. Experiments on the FFMTIMIT corpus show promising results with only a small amount of adaptation data.
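
The abstract does not give the MLST equations, so the following is only a minimal sketch of the environment model commonly assumed in this setting, not the authors' algorithm. It assumes that, per frequency bin, the noisy power spectrum is y = h * x + n, where h is a convolutional (channel) gain and n is additive noise. The function names and the 3-bin values are hypothetical. The sketch shows why the linear spectral domain is convenient: the corruption is a fixed affine map there, whereas in the log domain the shift depends on the clean speech level.

```python
import math

def corrupt_linear(clean, channel, noise):
    """Apply a per-bin channel gain and additive noise to a power spectrum."""
    return [h * x + n for x, h, n in zip(clean, channel, noise)]

def log_domain_shift(clean, channel, noise):
    """Return log(y) - log(x) per bin for the same corruption."""
    noisy = corrupt_linear(clean, channel, noise)
    return [math.log(y) - math.log(x) for x, y in zip(clean, noisy)]

# Hypothetical 3-bin spectra, for illustration only.
channel = [0.5, 0.8, 1.2]   # assumed convolutional gains
noise = [0.1, 0.1, 0.1]     # assumed additive noise power
quiet = [1.0, 2.0, 4.0]
loud = [10.0, 20.0, 40.0]

# Linear domain: the same (channel, noise) pair describes both inputs.
noisy_quiet = corrupt_linear(quiet, channel, noise)
noisy_loud = corrupt_linear(loud, channel, noise)

# Log domain: the shift differs between quiet and loud speech,
# so no single additive offset describes the environment.
shift_quiet = log_domain_shift(quiet, channel, noise)
shift_loud = log_domain_shift(loud, channel, noise)
```

Under these assumptions, two parameters per bin describe the environment in the linear domain, which is consistent with the abstract's claim that few noise parameters are needed.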

Similar resources

Noise suppression and loudness normalization in an auditory model-based acoustic front-end

It is commonly acknowledged that the presence of additive and convolutional noise and speech level variations can seriously deteriorate the performance of a speech recognizer. In case an auditory model is used as the acoustic front-end, it turns out that compensation techniques such as spectral subtraction and log-spectral mean subtraction can be outperformed by time-domain techniques operating...

Noise and room acoustics distorted speech recognition by HMM composition

This paper presents a robust speech recognition method based on HMM composition for speech distorted by noise and room acoustics. The method enables an improved user interface in which the user is not encumbered by microphone equipment. The proposed HMM composition is obtained by naturally extending the HMM composition method for additive noise to that of the convolutional room acoustics di...

A New Method for Speech Enhancement Based on Incoherent Model Learning in Wavelet Transform Domain

The quality of a speech signal is significantly reduced in the presence of environmental noise, which leads to imperfect performance of hearing aid devices, automatic speech recognition systems, and mobile phones. In this paper, single-channel speech enhancement of signals corrupted by additive noise is considered. A dictionary-based algorithm is proposed to train the speech...

Forward masking on a generalized logarithmic scale for robust speech recognition

This paper examines the forward masking on the generalized logarithmic scale for robust speech recognition to both additive and convolutional noise. The forward masking in the dynamic cepstral (DyC) representation is based upon subtraction of a masking pattern from a current spectrum on a logarithmic spectral domain, whereas the proposed method intends to make a compromise between the logarithm...

Uncertainty in Signal Estimation and Stochastic Weighted Viterbi Algorithm: A Unified Framework to Address Robustness in Speech Recognition and Speaker Verification

Robustness to noise and low-bit rate coding distortion is one of the main problems faced by automatic speech recognition (ASR) and speaker verification (SV) systems in real applications. Usually, ASR and SV models are trained with speech signals recorded in conditions that are different from testing environments. This mismatch between training and testing can lead to unacceptable error rates. N...

Publication date: 2004